Linking geographic vocabularies through WordNet

Authors

  • Andrea Ballatore
  • Michela Bertolotto
  • David C. Wilson
Abstract

The linked open data paradigm has emerged as a promising approach to structuring and sharing geospatial information. One of the major obstacles to this vision is the difficulty of automatically integrating the heterogeneous vocabularies and ontologies that provide the semantic backbone of the growing constellation of open geo-knowledge bases. In this article, we show how to utilise WordNet as a semantic hub to increase the integration of linked open data. To this end, we devise Voc2WordNet, an unsupervised technique for mapping a given vocabulary to WordNet that combines intensional and extensional aspects of the geographic terms. Voc2WordNet is evaluated against a sample of human-generated alignments with the OpenStreetMap Semantic Network, a crowdsourced geospatial resource, and the GeoNames ontology, the vocabulary of a large digital gazetteer. The empirical results indicate that the approach can obtain high precision and recall.
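The idea of combining intensional evidence (what a term's definition says) with extensional evidence (which instances the term covers) can be illustrated with a minimal sketch. This is not the Voc2WordNet algorithm, whose details the abstract does not give. The scoring scheme (Jaccard overlap with an equal-weight average), the function names and the toy data below are all assumptions made for illustration only.

```python
# Illustrative sketch only: NOT the authors' Voc2WordNet method.
# All names, weights and data here are hypothetical.

def tokens(text):
    """Lowercase whitespace tokenisation into a set of words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_candidates(term, candidates, weight=0.5):
    """Rank candidate senses for a vocabulary term, best first.

    intensional score: overlap between the term definition and the gloss
    extensional score: overlap between the two sets of known instances
    """
    scored = []
    for cand in candidates:
        intensional = jaccard(tokens(term["definition"]), tokens(cand["gloss"]))
        extensional = jaccard(set(term["instances"]), set(cand["instances"]))
        score = weight * intensional + (1 - weight) * extensional
        scored.append((cand["id"], score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vocabulary term and toy candidate senses (made-up entries).
term = {
    "label": "river",
    "definition": "a large natural stream of water",
    "instances": ["Thames", "Liffey", "Seine"],
}
candidates = [
    {"id": "river",
     "gloss": "a large natural stream of water larger than a creek",
     "instances": ["Thames", "Seine", "Nile"]},
    {"id": "bank",
     "gloss": "sloping land beside a body of water",
     "instances": []},
]

ranked = rank_candidates(term, candidates)
for sense_id, score in ranked:
    print(f"{sense_id}: {score:.3f}")
```

In this toy setting the "river" candidate ranks above "bank" because it shares both definition words and instances with the vocabulary term. A real mapping would draw candidate senses and glosses from WordNet itself and would need a more robust similarity measure.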

Similar resources

From Small to Big Data: paper manuscripts to RDF triples of Australian Indigenous Vocabularies

This paper discusses a project to encode archival vocabularies of Australian indigenous languages recorded in the early twentieth century and representing at least 40 different languages. We explore the text with novel techniques, based on encoding them in XML with a standard TEI schema. This project allows geographic navigation of the diverse vocabularies. Ontologies for people and placenames ...

Identifying Cognates by Phonetic and Semantic Similarity

I present a method of identifying cognates in the vocabularies of related languages. I show that a measure of phonetic similarity based on multivalued features performs better than “orthographic” measures, such as the Longest Common Subsequence Ratio (LCSR) or Dice’s coefficient. I introduce a procedure for estimating semantic similarity of glosses that employs keyword selection and WordNet. Te...

Publishing Reference Geodata on the Web: Opportunities and Challenges for IGN France

The French national mapping agency (IGN) produces several different but complementary geographic vector reference databases delivered in traditional GIS formats. However, linked data users have different expectations and habits, such as the need to browse an entire data catalogue in RDF using the “follow-your-nose” navigation capacity from one graph to another. Besides, traditional GIS data for...

Anchoring Dutch Cultural Heritage Thesauri to WordNet: Two Case Studies

In this paper, we argue on the interest of anchoring Dutch Cultural Heritage controlled vocabularies to WordNet, and demonstrate a reusable methodology for achieving this anchoring. We test it on two controlled vocabularies, namely the GTAA thesaurus, used at the Netherlands Institute for Sound and Vision (the Dutch radio and television archives), and the GTT thesaurus, used to index books of t...

The discourse of data: exploring data-related vocabularies in geographic information systems description

Various ideas of data have emerged, expressed in practice through distinct vocabularies of data-related terms. This article develops a six-category taxonomy of these vocabularies, and illustrates how their terms are utilized in texts which relate to geographic information systems (GIS) in general, and to the HYDRA5 water catchment modeling system developed for the Sydney Water Corporation in pa...

Journal:
  • Annals of GIS

Volume 20, Issue

Pages -

Publication date: 2014